Project - Pneumonia Detection¶

CONTEXT:

Computer vision can be used in health care to identify diseases. Pneumonia detection requires detecting inflammation of the lungs. In this challenge, you are required to build an algorithm that detects a visual signal for pneumonia in medical images; specifically, your algorithm needs to automatically locate lung opacities on chest radiographs.

DATA DESCRIPTION:

  • In the dataset, some images are labeled “No Lung Opacity / Not Normal”. This extra third class indicates that while pneumonia was determined not to be present, there was nonetheless some type of abnormality on the image, and oftentimes this finding may mimic the appearance of true pneumonia. The original medical images are stored in a special format called DICOM (*.dcm); each file contains a combination of header metadata and the underlying raw image array for pixel data.
  • Dataset has been attached along with this project. Please use the same for this capstone project.
  • Original link to the dataset : https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/data [ for your reference only ]. You can refer to the details of the dataset in the above link
  • Acknowledgements: https://www.kaggle.com/c/rsna-pneumonia-detection-challenge/overview/acknowledgements.

DATASET:

The dataset contains the following files and folders:

  • stage_2_train_labels.csv - The training set. It contains patientIds and bounding box / target information.
  • stage_2_detailed_class_info.csv – It provides detailed information about the type of positive or negative class for each image.

Apart from the above-mentioned data files (in csv format), the dataset also contains the images folders

  • stage_2_train_images
  • stage_2_test_images

The images in the above-mentioned folders are stored in a special format called DICOM files (*.dcm). They contain a combination of header metadata as well as underlying raw image arrays for pixel data.

Objective¶

Milestone 1: The objective of this notebook is to do Pre-Processing, Data Visualization and EDA.

Mounted at /content/drive
Collecting pydicom
  Downloading pydicom-2.4.4-py3-none-any.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 6.9 MB/s eta 0:00:00
Installing collected packages: pydicom
Successfully installed pydicom-2.4.4

Import Packages¶

Preparation of Train Dataset¶

Reading CSV files¶

First five rows of Training set:
                               patientId      x      y  width  height  Target
0  0004cfab-14fd-4e49-80ba-63a80b6bddd6    NaN    NaN    NaN     NaN       0
1  00313ee0-9eaa-42f4-b0ab-c148ed3241cd    NaN    NaN    NaN     NaN       0
2  00322d4d-1c29-4943-afc9-b6754be640eb    NaN    NaN    NaN     NaN       0
3  003d8fa0-6bf1-40ed-b54c-ac657f8495c5    NaN    NaN    NaN     NaN       0
4  00436515-870c-4b36-a041-de91049b9ab4  264.0  152.0  213.0   379.0       1

Some information about the data fields present in 'stage_2_train_labels.csv':

  • patientId - A unique patient identifier; each patientId corresponds to one image.
  • x - The x coordinate of the bounding box
  • y - The y coordinate of the bounding box
  • width - The width of the bounding box
  • height - The height of the bounding box
  • Target - The binary Target indicating whether this sample has evidence of pneumonia or not.
The train_label dataframe has 30227 rows and 6 columns.
Number of unique patientId are: 26684

Thus, the dataset contains information about 26684 patients. Out of these 26684 patients, some of them have multiple entries in the dataset.
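The counts above can be reproduced with pandas; a sketch on a toy frame standing in for stage_2_train_labels.csv (the real file has 30227 rows):

```python
import pandas as pd

# Toy stand-in for stage_2_train_labels.csv.
train_labels = pd.DataFrame({
    "patientId": ["p1", "p2", "p3", "p3"],   # p3 has two bounding boxes
    "x": [None, None, 264.0, 40.0],
    "Target": [0, 0, 1, 1],
})

n_rows = len(train_labels)                        # total entries
n_patients = train_labels["patientId"].nunique()  # unique patients
boxes_per_patient = train_labels["patientId"].value_counts()
print(n_rows, n_patients)
```

On the real data the same three lines give 30227 rows, 26684 unique patients, and the per-patient entry counts.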

No of entries which has Pneumonia: 9555 ~ 32.0%
No of entries which don't have Pneumonia: 20672 ~ 68.0%
[Figure: pie chart of the Target distribution]

Thus, from the above pie chart it is clear that out of the 30227 entries in the dataset, 20672 (i.e., 68%) correspond to patients not having Pneumonia, whereas 9555 (i.e., 32%) correspond to positive cases of Pneumonia.

Number of nulls in bounding box columns: {'x': 20672, 'y': 20672, 'width': 20672, 'height': 20672}

Thus, we can see that the number of nulls in the bounding-box columns equals the number of 0's in the Target column.
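This correspondence can be checked directly: a row has NaN box coordinates exactly when Target is 0. A sketch on a toy frame:

```python
import pandas as pd

# Toy stand-in for the labels frame.
train_labels = pd.DataFrame({
    "x": [None, None, 264.0],
    "y": [None, None, 152.0],
    "Target": [0, 0, 1],
})

n_null_x = int(train_labels["x"].isna().sum())
n_target0 = int((train_labels["Target"] == 0).sum())
print(n_null_x, n_target0)   # equal: boxes exist only for positive cases
```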

Thus, 23286 unique patients have only one entry in the dataset, while the remaining patients have multiple entries (one per bounding box).

  • stage_2_detailed_class_info.csv

It provides detailed information about the type of positive or negative class for each image.

First five rows of Class label dataset are:
                               patientId                         class
0  0004cfab-14fd-4e49-80ba-63a80b6bddd6  No Lung Opacity / Not Normal
1  00313ee0-9eaa-42f4-b0ab-c148ed3241cd  No Lung Opacity / Not Normal
2  00322d4d-1c29-4943-afc9-b6754be640eb  No Lung Opacity / Not Normal
3  003d8fa0-6bf1-40ed-b54c-ac657f8495c5                        Normal
4  00436515-870c-4b36-a041-de91049b9ab4                  Lung Opacity

Some information about the data fields present in 'stage_2_detailed_class_info.csv':

  • patientId - A patientId.
  • class - Takes one of three values depending on the current state of the patient's lungs: 'No Lung Opacity / Not Normal', 'Normal' and 'Lung Opacity'.
The class_label dataframe has 30227 rows and 2 columns.
Number of unique patientId are: 26684

Thus, the dataset contains information about 26684 patients (which is same as that of the train_labels dataframe).

Feature: class
No Lung Opacity / Not Normal  : 11821 which is 39.1% of the total data in the dataset
Lung Opacity                  : 9555 which is 31.61% of the total data in the dataset
Normal                        : 8851 which is 29.28% of the total data in the dataset
[Figure: bar chart of the class distribution]
Number of nulls in class columns: 0

Thus, none of the rows in class_labels has an empty class value.

Maximum number of distinct classes per patientId: 1

Thus, we can say that each patientId is associated with only one class.
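The check behind this result can be sketched as a groupby on a toy stand-in for stage_2_detailed_class_info.csv:

```python
import pandas as pd

class_info = pd.DataFrame({
    "patientId": ["p1", "p3", "p3"],   # duplicate rows carry the same class
    "class": ["Normal", "Lung Opacity", "Lung Opacity"],
})

# Count how many distinct class values each patientId carries.
classes_per_patient = class_info.groupby("patientId")["class"].nunique()
print(classes_per_patient.max())   # 1 -> one class per patientId
```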

After merging, the dataset looks like: 

patientId x y width height Target number_of_boxes class
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0 1 Normal
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1 2 Lung Opacity
After merge, the dataset has 30227 rows and 8 columns.
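One way the merged frame above may have been built (the exact code isn't shown; number_of_boxes is assumed here to be the count of rows per patientId):

```python
import pandas as pd

train_labels = pd.DataFrame({
    "patientId": ["p1", "p3", "p3"],
    "x": [None, 264.0, 40.0],
    "Target": [0, 1, 1],
})
class_info = pd.DataFrame({
    "patientId": ["p1", "p3"],
    "class": ["Normal", "Lung Opacity"],
}).drop_duplicates("patientId")   # one class row per patient

# Left-join the class onto every bounding-box row, then count boxes.
merged = train_labels.merge(class_info, on="patientId", how="left")
merged["number_of_boxes"] = (
    merged.groupby("patientId")["patientId"].transform("size")
)
print(merged.shape)
```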

Target and Class¶

[Figure: 'Class and Target for Chest Exams']

Thus, Target = 1 is associated only with class = Lung Opacity, whereas Target = 0 is associated with both No Lung Opacity / Not Normal and Normal.

Bounding Box Distribution¶

[Figure: scatter plot of bounding-box centers]

Thus, we can see that the centers of the bounding boxes are spread out across the lungs. A large portion of the boxes have their centers near the center of the lung, but some centers are located at the lung edges.
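The plotted centers are derived from the box coordinates; a sketch of the computation (x, y is the top-left corner of the box):

```python
import pandas as pd

# Toy Target = 1 rows with bounding-box coordinates.
boxes = pd.DataFrame({
    "x": [264.0], "y": [152.0], "width": [213.0], "height": [379.0],
})

# Center = top-left corner plus half the extent on each axis.
boxes["cx"] = boxes["x"] + boxes["width"] / 2
boxes["cy"] = boxes["y"] + boxes["height"] / 2
print(boxes[["cx", "cy"]].iloc[0].tolist())   # [370.5, 341.5]
```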

Reading Images¶

Images provided are stored in DICOM (.dcm) format which is an international standard to transmit, store, retrieve, print, process, and display medical imaging information. Digital Imaging and Communications in Medicine (DICOM) makes medical imaging information interoperable. We will make use of pydicom package here to read the images.

Metadata of the image consists of 
 Dataset.file_meta -------------------------------
(0002, 0000) File Meta Information Group Length  UL: 202
(0002, 0001) File Meta Information Version       OB: b'\x00\x01'
(0002, 0002) Media Storage SOP Class UID         UI: Secondary Capture Image Storage
(0002, 0003) Media Storage SOP Instance UID      UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526
(0002, 0010) Transfer Syntax UID                 UI: JPEG Baseline (Process 1)
(0002, 0012) Implementation Class UID            UI: 1.2.276.0.7230010.3.0.3.6.0
(0002, 0013) Implementation Version Name         SH: 'OFFIS_DCMTK_360'
-------------------------------------------------
(0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.2.276.0.7230010.3.1.4.8323329.28530.1517874485.775526
(0008, 0020) Study Date                          DA: '19010101'
(0008, 0030) Study Time                          TM: '000000.00'
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'CR'
(0008, 0064) Conversion Type                     CS: 'WSD'
(0008, 0090) Referring Physician's Name          PN: ''
(0008, 103e) Series Description                  LO: 'view: PA'
(0010, 0010) Patient's Name                      PN: '0004cfab-14fd-4e49-80ba-63a80b6bddd6'
(0010, 0020) Patient ID                          LO: '0004cfab-14fd-4e49-80ba-63a80b6bddd6'
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: 'F'
(0010, 1010) Patient's Age                       AS: '51'
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 5101) View Position                       CS: 'PA'
(0020, 000d) Study Instance UID                  UI: 1.2.276.0.7230010.3.1.2.8323329.28530.1517874485.775525
(0020, 000e) Series Instance UID                 UI: 1.2.276.0.7230010.3.1.3.8323329.28530.1517874485.775524
(0020, 0010) Study ID                            SH: ''
(0020, 0011) Series Number                       IS: '1'
(0020, 0013) Instance Number                     IS: '1'
(0020, 0020) Patient Orientation                 CS: ''
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 1024
(0028, 0011) Columns                             US: 1024
(0028, 0030) Pixel Spacing                       DS: [0.14300000000000002, 0.14300000000000002]
(0028, 0100) Bits Allocated                      US: 8
(0028, 0101) Bits Stored                         US: 8
(0028, 0102) High Bit                            US: 7
(0028, 0103) Pixel Representation                US: 0
(0028, 2110) Lossy Image Compression             CS: '01'
(0028, 2114) Lossy Image Compression Method      CS: 'ISO_10918_1'
(7fe0, 0010) Pixel Data                          OB: Array of 142006 elements

From the above sample we can see that the DICOM file contains information that can be used for further analysis, such as sex, age, body part examined, view position and modality. The size of this image is 1024 x 1024 (rows x columns).

Number of images in training images folders are: 26684.

Thus, the training images folder contains 26684 images, the same as the number of unique patientIds present in either of the csv files. Hence each unique patientId in the csv files corresponds to exactly one image in the folder.
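A sketch of how the path/patientId frame may be built (filenames are assumed to be <patientId>.dcm, as in this dataset):

```python
from pathlib import Path

import pandas as pd


def build_image_index(image_dir):
    """Index every *.dcm under image_dir: one row per image."""
    paths = sorted(Path(image_dir).glob("*.dcm"))
    return pd.DataFrame({
        "path": [str(p) for p in paths],
        "patientId": [p.stem for p in paths],  # filename without extension
    })
```

`build_image_index('stage_2_train_images')` would then be merged with the labels frame on patientId.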

Columns in the training images dataframe: ['path', 'patientId']
After merging the two dataframe, the training_data has 30227 rows and 9 columns.
The training_data dataframe as of now - 

patientId x y width height Target number_of_boxes class path
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im...
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im...
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im...
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0 1 Normal /content/drive/MyDrive/zipOut/stage_2_train_im...
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1 2 Lung Opacity /content/drive/MyDrive/zipOut/stage_2_train_im...
So after parsing the information from the DICOM images, our training_data dataframe has 30227 rows and 18 columns and it looks like:

patientId x y width height Target number_of_boxes class path Modality PatientAge PatientSex BodyPartExamined ViewPosition ConversionType Rows Columns PixelSpacing
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 51 F CHEST PA WSD 1024 1024 0.143
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 48 F CHEST PA WSD 1024 1024 0.194
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 19 M CHEST AP WSD 1024 1024 0.168
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0 1 Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 28 M CHEST PA WSD 1024 1024 0.143
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1 2 Lung Opacity /content/drive/MyDrive/zipOut/stage_2_train_im... CR 32 F CHEST AP WSD 1024 1024 0.139

Going forward we will now use this pickle file as our training data.

EDA on this saved training data:

Our training data consists of 30227 rows and 18 columns and looks like:

patientId x y width height Target number_of_boxes class path Modality PatientAge PatientSex BodyPartExamined ViewPosition ConversionType Rows Columns PixelSpacing
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 51 F CHEST PA WSD 1024 1024 0.143
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 48 F CHEST PA WSD 1024 1024 0.194
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 19 M CHEST AP WSD 1024 1024 0.168
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0 1 Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 28 M CHEST PA WSD 1024 1024 0.143
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1 2 Lung Opacity /content/drive/MyDrive/zipOut/stage_2_train_im... CR 32 F CHEST AP WSD 1024 1024 0.139

Modality¶

Modality for the images obtained is: CR 

Body Part Examined¶

The images obtained are of CHEST areas.

Understanding Different Positions¶

Feature: ViewPosition
AP                            : 15297 which is 50.6% of the total data in the dataset
PA                            : 14930 which is 49.39% of the total data in the dataset

As seen above, the two View Positions in the training dataset are AP (Anterior/Posterior) and PA (Posterior/Anterior). These types of X-rays are mostly used to obtain the front view; a lateral image is usually taken to complement the front view.

  • Posterior/Anterior (PA): The chest radiograph is acquired by passing the X-ray beam from the patient's posterior (back) to the anterior (front) of the chest. While obtaining the image, the patient is asked to stand with their chest against the film. In this image, the heart is on the right side of the image as one looks at it. PA images are of higher quality and assess the heart size more accurately.
  • Anterior/Posterior (AP): At times it is not possible for radiographers to acquire a PA chest X-ray, usually because the patient is too unwell to stand. In these images the size of the heart appears exaggerated.
The distribution of View Position when there is an evidence of Pneumonia:

[Figure: distribution of View Position for Target = 1]
Plot of x and y centers of the bounding boxes:

[Figures: bounding-box center plots for the PA and AP views]

We can see that the centers of the box are spread across the entire region of the Lungs. Both of the cases (PA and AP) seem to have outliers in them.

Conversion Type¶

Conversion Type for the data in Training Data:  WSD

Rows and Columns¶

The training images has 1024 rows and 1024 columns.

Patient Age¶

The minimum and maximum recorded age of the patients are 1 and 155 respectively.
The number of outliers in `PatientAge` out of the 30227 records is: 5

The ages which are in the outlier categories are: [148, 151, 153, 150, 155]
[Figure: 'Outliers in PatientAge']

Thus, we can say that ages like 148, 150, 151, 153 and 155 are data-entry mistakes. We can clip these outlier values to a lower value, say 100, so that the maximum patient age becomes 100.

To get a clearer picture, we will introduce a new column placing each patient in an age group such as (0, 10], (10, 20], etc.
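A sketch of the clipping and binning described above (cap at 100, then 10-year bins):

```python
import pandas as pd

ages = pd.Series([51, 48, 19, 28, 155])       # 155 is an outlier
ages = ages.clip(upper=100)                   # cap implausible ages at 100
age_bins = pd.cut(ages, bins=range(0, 101, 10))
print(age_bins.astype(str).tolist())
```

`pd.cut` produces right-closed intervals like (50, 60], matching the bin labels in the value counts above.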

Removing the outliers from `PatientAge`
count     30227
unique       93
top          58
freq        955
Name: PatientAge, dtype: int64
Distribution of `PatientAge`: Overall and Target = 1
[Figure: PatientAge distributions]
PatientAgeBins
(50.0, 60.0]     7446
(40.0, 50.0]     5671
(60.0, 70.0]     4730
(30.0, 40.0]     4551
(20.0, 30.0]     3704
(10.0, 20.0]     1688
(70.0, 80.0]     1637
(0.0, 10.0]       515
(80.0, 90.0]      275
(90.0, 100.0]      10
Name: count, dtype: int64

Thus, we can see that the maximum number of patients belong to the age group (50, 60], whereas the least belong to (90, 100].

After adding the bin column, the dataset turns out to be:

patientId x y width height Target number_of_boxes class path Modality PatientAge PatientSex BodyPartExamined ViewPosition ConversionType Rows Columns PixelSpacing PatientAgeBins
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 51 F CHEST PA WSD 1024 1024 0.143 (50.0, 60.0]
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 48 F CHEST PA WSD 1024 1024 0.194 (40.0, 50.0]
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0 1 No Lung Opacity / Not Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 19 M CHEST AP WSD 1024 1024 0.168 (10.0, 20.0]
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0 1 Normal /content/drive/MyDrive/zipOut/stage_2_train_im... CR 28 M CHEST PA WSD 1024 1024 0.143 (20.0, 30.0]
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1 2 Lung Opacity /content/drive/MyDrive/zipOut/stage_2_train_im... CR 32 F CHEST AP WSD 1024 1024 0.139 (30.0, 40.0]

From the age-group plots we can infer that the largest share of males and females, of patients in the “No Lung Opacity / Not Normal” and “Lung Opacity” classes, and of patients having Pneumonia all lie in the (50, 60] age group.

[Figures: age-group distributions by sex, by class, and by Target]

Plotting DICOM Images¶

  • Target = 0
[Figure: sample chest radiographs with Target = 0]

As the above subplots show images belonging to either "Normal" or "No Lung Opacity / Not Normal", no bounding box is observed.

  • Target = 1
[Figure: sample chest radiographs with Target = 1, bounding boxes overlaid]

In the above subplots, we can see that the area covered by the box (in blue colour) depicts the area of interest i.e., the area in which the opacity is observed in the Lungs.
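A sketch of overlaying bounding boxes on a pixel array with matplotlib (shown on a dummy array; in the notebook the array comes from the DICOM's pixel_array):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted use
import matplotlib.patches as patches
import matplotlib.pyplot as plt


def show_with_boxes(pixels, boxes):
    """pixels: 2-D array; boxes: iterable of (x, y, width, height)."""
    fig, ax = plt.subplots()
    ax.imshow(pixels, cmap="bone")
    for x, y, w, h in boxes:
        # Rectangle takes the top-left corner plus width/height.
        ax.add_patch(patches.Rectangle(
            (x, y), w, h, fill=False, edgecolor="blue", linewidth=2))
    return fig
```

Calling `show_with_boxes(ds.pixel_array, [(264, 152, 213, 379)])` would reproduce one of the Target = 1 subplots.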

Conclusion¶

  • The training dataset (both of the csv files and the training image folder) contains information on 26684 unique patients
  • Out of these 26684 unique patients, some have multiple entries in both of the csv files
  • Most of the recorded patients belong to Target = 0 (i.e., they don't have Pneumonia)
  • Some of the patients have more than one bounding box, the maximum being 4
  • The classes "No Lung Opacity / Not Normal" and "Normal" are associated with Target = 0, whereas "Lung Opacity" belongs to Target = 1
  • The images are in DICOM format, from which information like PatientAge, PatientSex, ViewPosition etc. is obtained
  • There are two view positions from which images were obtained: AP and PA. The age ranges from 1-155 (ages above 100 were clipped to 100)
  • The centers of the bounding boxes are spread over the entire lung region, though some centers are outliers.

Design, train and test basic CNN models for classification¶


Pre Processing the image¶

(30227, 128, 128, 3) (30227,)

Scikit-learn suggests using OneHotEncoder for the X matrix (i.e., the features you feed into a model) and LabelBinarizer for the y labels.
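A sketch of binarizing the three class labels with LabelBinarizer (scikit-learn orders the classes alphabetically):

```python
from sklearn.preprocessing import LabelBinarizer

labels = ["Normal", "Lung Opacity", "No Lung Opacity / Not Normal"]
lb = LabelBinarizer()
y = lb.fit_transform(labels)   # one-hot rows, one column per class
print(lb.classes_)
print(y.shape)                 # (3, 3)
```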

CNN Model

Model Summary

WARNING:absl:`lr` is deprecated in Keras optimizer, please use `learning_rate` or use the legacy optimizer, e.g.,tf.keras.optimizers.legacy.Adam.
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 128, 128, 32)      896       
                                                                 
 conv2d_1 (Conv2D)           (None, 128, 128, 32)      9248      
                                                                 
 max_pooling2d (MaxPooling2  (None, 64, 64, 32)        0         
 D)                                                              
                                                                 
 dropout (Dropout)           (None, 64, 64, 32)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 64, 64, 64)        18496     
                                                                 
 conv2d_3 (Conv2D)           (None, 64, 64, 64)        36928     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 32, 32, 64)        0         
 g2D)                                                            
                                                                 
 dropout_1 (Dropout)         (None, 32, 32, 64)        0         
                                                                 
 conv2d_4 (Conv2D)           (None, 32, 32, 128)       73856     
                                                                 
 conv2d_5 (Conv2D)           (None, 32, 32, 128)       147584    
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 16, 16, 128)       0         
 g2D)                                                            
                                                                 
 dropout_2 (Dropout)         (None, 16, 16, 128)       0         
                                                                 
 global_max_pooling2d (Glob  (None, 128)               0         
 alMaxPooling2D)                                                 
                                                                 
 dense (Dense)               (None, 256)               33024     
                                                                 
 dropout_3 (Dropout)         (None, 256)               0         
                                                                 
 dense_1 (Dense)             (None, 3)                 771       
                                                                 
=================================================================
Total params: 320803 (1.22 MB)
Trainable params: 320803 (1.22 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
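The summary above pins down the architecture; a sketch reconstructing it in Keras. The 3x3 'same'-padded convolutions follow from the unchanged spatial sizes and the per-layer parameter counts; the dropout rates are assumptions, since the summary does not record them.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),          # rate assumed
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),          # rate assumed
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.Conv2D(128, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),          # rate assumed
    layers.GlobalMaxPooling2D(),   # (16, 16, 128) -> 128
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),           # rate assumed
    layers.Dense(3, activation="softmax"),   # 3 classes
])
# Total parameters match the summary: 320,803
```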

Training for 20 epochs with a batch size of 16

Epoch 1/20
1323/1323 [==============================] - 43s 26ms/step - loss: 1.1691 - accuracy: 0.4095 - val_loss: 1.0307 - val_accuracy: 0.4404
Epoch 2/20
1323/1323 [==============================] - 33s 25ms/step - loss: 1.0226 - accuracy: 0.4485 - val_loss: 1.0039 - val_accuracy: 0.4580
Epoch 3/20
1323/1323 [==============================] - 33s 25ms/step - loss: 1.0036 - accuracy: 0.4618 - val_loss: 0.9891 - val_accuracy: 0.4675
Epoch 4/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9943 - accuracy: 0.4741 - val_loss: 0.9966 - val_accuracy: 0.4712
Epoch 5/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9950 - accuracy: 0.4800 - val_loss: 0.9769 - val_accuracy: 0.4931
Epoch 6/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9825 - accuracy: 0.4851 - val_loss: 0.9747 - val_accuracy: 0.5052
Epoch 7/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9979 - accuracy: 0.4711 - val_loss: 0.9738 - val_accuracy: 0.4988
Epoch 8/20
1323/1323 [==============================] - 34s 25ms/step - loss: 0.9847 - accuracy: 0.4825 - val_loss: 0.9832 - val_accuracy: 0.4990
Epoch 9/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9833 - accuracy: 0.4846 - val_loss: 0.9837 - val_accuracy: 0.4933
Epoch 10/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9766 - accuracy: 0.4886 - val_loss: 0.9859 - val_accuracy: 0.4842
Epoch 11/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9787 - accuracy: 0.4886 - val_loss: 0.9767 - val_accuracy: 0.4964
Epoch 12/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9785 - accuracy: 0.4868 - val_loss: 0.9583 - val_accuracy: 0.5133
Epoch 13/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9601 - accuracy: 0.5069 - val_loss: 0.9554 - val_accuracy: 0.5275
Epoch 14/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9736 - accuracy: 0.4892 - val_loss: 0.9887 - val_accuracy: 0.4736
Epoch 15/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9754 - accuracy: 0.4908 - val_loss: 0.9645 - val_accuracy: 0.5085
Epoch 16/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9690 - accuracy: 0.4956 - val_loss: 0.9841 - val_accuracy: 0.4873
Epoch 17/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9729 - accuracy: 0.4909 - val_loss: 0.9681 - val_accuracy: 0.5213
Epoch 18/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9679 - accuracy: 0.4994 - val_loss: 0.9551 - val_accuracy: 0.5107
Epoch 19/20
1323/1323 [==============================] - 34s 25ms/step - loss: 0.9635 - accuracy: 0.5073 - val_loss: 0.9552 - val_accuracy: 0.5116
Epoch 20/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9577 - accuracy: 0.5073 - val_loss: 0.9401 - val_accuracy: 0.5259
Epoch 1/20
1323/1323 [==============================] - 34s 25ms/step - loss: 0.9881 - accuracy: 0.4836 - val_loss: 0.9777 - val_accuracy: 0.4957 - lr: 0.0010
Epoch 2/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9708 - accuracy: 0.4938 - val_loss: 0.9594 - val_accuracy: 0.5050 - lr: 0.0010
Epoch 3/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9654 - accuracy: 0.5005 - val_loss: 0.9576 - val_accuracy: 0.4873 - lr: 0.0010
Epoch 4/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9645 - accuracy: 0.5006 - val_loss: 0.9526 - val_accuracy: 0.5034 - lr: 0.0010
Epoch 5/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9634 - accuracy: 0.4979 - val_loss: 0.9496 - val_accuracy: 0.5103 - lr: 0.0010
Epoch 6/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9651 - accuracy: 0.5014 - val_loss: 0.9565 - val_accuracy: 0.5078 - lr: 0.0010
Epoch 7/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9735 - accuracy: 0.4967 - val_loss: 0.9524 - val_accuracy: 0.5125 - lr: 0.0010
Epoch 8/20
1323/1323 [==============================] - 33s 25ms/step - loss: 0.9672 - accuracy: 0.4999 - val_loss: 0.9521 - val_accuracy: 0.5083 - lr: 0.0010

Training accuracy is around 49.99 percent whereas validation accuracy is around 50.83 percent. We have avoided overfitting, but it is clear that a plain CNN will not be sufficient.

142/142 [==============================] - 1s 10ms/step - loss: 0.9490 - accuracy: 0.5143
Test loss: 0.9489631056785583
Test accuracy: 0.514336109161377
[Figure: training and validation curves]
Method accuracy Test Score
0 CNN 0.499905 0.514336
142/142 [==============================] - 1s 9ms/step
[Figure: confusion matrix]

The metrics indicate that classes 0 and 2 are being predicted well by the model; however, class 1 requires further improvement.

142/142 [==============================] - 1s 9ms/step
Method accuracy Test Score 1_precision 1_recall 1_f1-score 1_support
0 CNN 0.499905 0.514336 0.44186 0.32853 0.37686 1735

The model requires further fine tuning to make more precise predictions for detecting Pneumonia.
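Per-class metrics like the ones tabulated above come from scikit-learn's classification_report; a sketch on toy predictions:

```python
from sklearn.metrics import classification_report

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]   # one class-1 sample misread as class 2

report = classification_report(y_true, y_pred, output_dict=True)
# Per-class precision/recall/f1, keyed by the class label as a string.
print(report["1"]["precision"], report["1"]["recall"])
```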

Training Accuracy: Slightly increasing over epochs, indicating the model is learning from the training data.

Training Loss: Gradually decreasing, indicating the model is effectively learning.

Validation Loss: Decreasing overall, but with fluctuations, suggesting the model is improving but may still be overfitting slightly or learning at an unstable rate. We need to add more data or use augmentation to improve generalization.
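A sketch of the augmentation idea with Keras' ImageDataGenerator (modest geometric transforms; horizontal_flip is deliberately left off because left/right anatomy matters on chest radiographs):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(
    rotation_range=10,        # small rotations only
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
)

# Demo on random data shaped like our (30227, 128, 128, 3) input.
x = np.random.rand(4, 128, 128, 3).astype("float32")
batch = next(aug.flow(x, batch_size=4, shuffle=False))
print(batch.shape)   # (4, 128, 128, 3)
```

In training, `aug.flow(X_train, y_train, batch_size=16)` would be passed to `model.fit` in place of the raw arrays.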

Milestone 2 - Training and tuning Models for Pneumonia Detection¶

Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ conv2d (Conv2D)                 │ (None, 510, 510, 32)   │           320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d (MaxPooling2D)    │ (None, 255, 255, 32)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_1 (Conv2D)               │ (None, 253, 253, 64)   │        18,496 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_1 (MaxPooling2D)  │ (None, 126, 126, 64)   │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ conv2d_2 (Conv2D)               │ (None, 124, 124, 128)  │        73,856 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ max_pooling2d_2 (MaxPooling2D)  │ (None, 62, 62, 128)    │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ flatten (Flatten)               │ (None, 492032)         │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense (Dense)                   │ (None, 512)            │   251,920,896 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │           513 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 252,014,081 (961.36 MB)
 Trainable params: 252,014,081 (961.36 MB)
 Non-trainable params: 0 (0.00 B)
Epoch 1/5
2024-06-21 18:12:45.702711: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 0: 3.19583, expected 2.2314
2024-06-21 18:12:45.702780: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 1: 2.61042, expected 1.646
2024-06-21 18:12:45.702789: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 2: 2.56807, expected 1.60364
2024-06-21 18:12:45.702797: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 3: 3.14259, expected 2.17816
2024-06-21 18:12:45.702805: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 4: 2.8223, expected 1.85788
2024-06-21 18:12:45.702812: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 5: 3.01424, expected 2.04982
2024-06-21 18:12:45.702820: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 6: 3.73762, expected 2.7732
2024-06-21 18:12:45.702827: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 7: 2.53686, expected 1.57243
2024-06-21 18:12:45.702835: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 8: 2.18258, expected 1.21815
2024-06-21 18:12:45.702842: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 9: 3.01805, expected 2.05362
2024-06-21 18:12:45.717155: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:705] Results mismatch between different convolution algorithms. This is likely a bug/unexpected loss of precision in cudnn.
(f32[4,32,510,510]{3,2,1,0}, u8[0]{0}) custom-call(f32[4,1,512,512]{3,2,1,0}, f32[32,1,3,3]{3,2,1,0}, f32[32]{0}), window={size=3x3}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", backend_config={"conv_result_scale":1,"activation_mode":"kRelu","side_input_scale":0,"leakyrelu_alpha":0} for eng20{k2=2,k4=1,k5=1,k6=0,k7=0} vs eng15{k5=1,k6=0,k7=1,k10=1}
2024-06-21 18:12:45.717196: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:270] Device: Tesla P100-PCIE-16GB
2024-06-21 18:12:45.717204: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:271] Platform: Compute Capability 6.0
2024-06-21 18:12:45.717211: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:272] Driver: 12020 (535.129.3)
2024-06-21 18:12:45.717217: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:273] Runtime: <undefined>
2024-06-21 18:12:45.717233: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:280] cudnn version: 8.9.0
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1718993571.450613     103 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
W0000 00:00:1718993571.469478     103 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 401s 72ms/step - accuracy: 0.6758 - loss: 1.2090 - val_accuracy: 0.6763 - val_loss: 0.6297
Epoch 2/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 291s 53ms/step - accuracy: 0.6829 - loss: 0.6248 - val_accuracy: 0.6763 - val_loss: 0.6299
Epoch 3/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 291s 53ms/step - accuracy: 0.6855 - loss: 0.6228 - val_accuracy: 0.6763 - val_loss: 0.6297
Epoch 4/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 292s 54ms/step - accuracy: 0.6867 - loss: 0.6219 - val_accuracy: 0.6763 - val_loss: 0.6298
Epoch 5/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 292s 54ms/step - accuracy: 0.6848 - loss: 0.6233 - val_accuracy: 0.6763 - val_loss: 0.6298
<keras.src.callbacks.history.History at 0x79b37c052260>
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 172s 32ms/step - accuracy: 0.6861 - loss: 0.6222
605/605 ━━━━━━━━━━━━━━━━━━━━ 19s 31ms/step - accuracy: 0.6785 - loss: 0.6281
1512/1512 ━━━━━━━━━━━━━━━━━━━━ 86s 57ms/step - accuracy: 0.6956 - loss: 0.6148
Empty DataFrame
Columns: [Model, Training Loss, Training Accuracy, Validation Loss, Validation Accuracy, Testing Loss, Testing Accuracy]
Index: []
            Model  Training Loss  Training Accuracy  Validation Loss  \
0  CNN Base Model       0.623253           0.684726         0.629796   

   Validation Accuracy  Testing Loss  Testing Accuracy  
0             0.676313      0.623872          0.683923  
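A minimal sketch of how such a results table can be assembled with pandas, starting from the empty DataFrame printed above and appending one row per trained model (column names follow the output; the notebook's exact helper code may differ):

```python
import pandas as pd

cols = ["Model", "Training Loss", "Training Accuracy",
        "Validation Loss", "Validation Accuracy",
        "Testing Loss", "Testing Accuracy"]
results = pd.DataFrame(columns=cols)  # prints as "Empty DataFrame" at first

# .loc with a new index label enlarges the frame by one row.
results.loc[len(results)] = ["CNN Base Model",
                             0.623253, 0.684726,
                             0.629796, 0.676313,
                             0.623872, 0.683923]
print(results)
```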

The basic CNN model above achieves a testing accuracy of about 68%. We will try to improve on this by adding more layers and applying regularisation techniques to avoid overfitting.

Regularising & Improving the Base CNN Model
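One common regularisation technique is dropout. A minimal NumPy illustration of inverted dropout (illustrative only; in the notebook this would be a Keras `Dropout` layer):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate, training=True):
    # Inverted dropout: zero a fraction `rate` of activations during
    # training and rescale survivors so the expected value is unchanged.
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

x = np.ones(100_000)
y = dropout(x, rate=0.5)
print(y.mean())  # close to 1.0, since survivors are scaled by 1/(1 - rate)
```

At inference time (`training=False`) the input passes through unchanged, so no rescaling is needed.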

Epoch 1/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 308s 52ms/step - accuracy: 0.6656 - loss: 16.6197 - val_accuracy: 0.6755 - val_loss: 1.9177
Epoch 2/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 267s 49ms/step - accuracy: 0.6747 - loss: 1.8139 - val_accuracy: 0.7255 - val_loss: 0.7162
Epoch 3/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 268s 49ms/step - accuracy: 0.7276 - loss: 0.6608 - val_accuracy: 0.7404 - val_loss: 0.6033
Epoch 4/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 270s 50ms/step - accuracy: 0.7527 - loss: 0.5861 - val_accuracy: 0.7528 - val_loss: 0.5957
Epoch 5/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 269s 49ms/step - accuracy: 0.7605 - loss: 0.5853 - val_accuracy: 0.7751 - val_loss: 0.5505
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 174s 32ms/step - accuracy: 0.7804 - loss: 0.5423
605/605 ━━━━━━━━━━━━━━━━━━━━ 20s 32ms/step - accuracy: 0.7722 - loss: 0.5605
1512/1512 ━━━━━━━━━━━━━━━━━━━━ 47s 31ms/step - accuracy: 0.7786 - loss: 0.5488
                   Model  Training Loss  Training Accuracy  Validation Loss  \
0         CNN Base Model       0.623253           0.684726         0.629796   
1  Regularized CNN Model       0.541767           0.780673         0.550529   

   Validation Accuracy  Testing Loss  Testing Accuracy  
0             0.676313      0.623872          0.683923  
1             0.775114      0.554473          0.773900  

The regularised and improved CNN model achieves a testing accuracy of 77.39%. It does not overfit the data, and its lower validation and testing losses compared with the base model indicate that it learns effectively from the data.
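The no-overfitting claim can be checked directly from the figures in the results table above: the gap between training and validation accuracy for the regularised model is very small.

```python
# Figures taken from the results table above (regularised model).
train_acc, val_acc, test_acc = 0.780673, 0.775114, 0.773900

gap = train_acc - val_acc
print(f"train/val accuracy gap: {gap:.4f}")  # well under 1 percentage point
```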

Next, we will try to improve model performance further using transfer learning.

Transfer Learning - VGG16

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 3s 0us/step
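A minimal sketch of the transfer-learning setup, assuming the usual pattern of freezing the VGG16 convolutional base and training a small classification head on top (the notebook's exact head architecture is not shown here; `weights=None` keeps this sketch offline, whereas the notebook downloads the ImageNet `notop` weights shown above):

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Pretrained convolutional base without the ImageNet classifier head.
base = VGG16(weights=None, include_top=False, input_shape=(512, 512, 3))
base.trainable = False  # freeze the base so only the new head trains

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),        # hypothetical head choice
    layers.Dense(512, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # binary pneumonia / not
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Note that VGG16 expects 3-channel input, which is why the convolution shapes in the logs below show `f32[.,3,512,512]` even though the DICOM images are single-channel grayscale.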
Epoch 1/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 0s 161ms/step - accuracy: 0.6733 - loss: 17118.5215
W0000 00:00:1718997964.376797     106 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
2024-06-21 19:26:35.126184: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 0: 4.58381, expected 3.71248
2024-06-21 19:26:35.126275: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 1: 5.57184, expected 4.70052
2024-06-21 19:26:35.126290: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 2: 5.08221, expected 4.21089
2024-06-21 19:26:35.126299: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 3: 5.21032, expected 4.339
2024-06-21 19:26:35.126308: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 4: 4.77947, expected 3.90814
2024-06-21 19:26:35.126327: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 5: 5.2979, expected 4.42658
2024-06-21 19:26:35.126334: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 6: 5.65663, expected 4.78531
2024-06-21 19:26:35.126342: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 7: 5.38052, expected 4.5092
2024-06-21 19:26:35.126349: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 8: 5.22912, expected 4.35779
2024-06-21 19:26:35.126357: E external/local_xla/xla/service/gpu/buffer_comparator.cc:1137] Difference at 9: 5.86884, expected 4.99752
2024-06-21 19:26:35.149637: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:705] Results mismatch between different convolution algorithms. This is likely a bug/unexpected loss of precision in cudnn.
(f32[3,64,512,512]{3,2,1,0}, u8[0]{0}) custom-call(f32[3,3,512,512]{3,2,1,0}, f32[64,3,3,3]{3,2,1,0}, f32[64]{0}), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBiasActivationForward", backend_config={"conv_result_scale":1,"activation_mode":"kRelu","side_input_scale":0,"leakyrelu_alpha":0} for eng20{k2=1,k4=3,k5=1,k6=0,k7=0} vs eng15{k5=1,k6=0,k7=1,k10=1}
2024-06-21 19:26:35.149692: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:270] Device: Tesla P100-PCIE-16GB
2024-06-21 19:26:35.149701: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:271] Platform: Compute Capability 6.0
2024-06-21 19:26:35.149707: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:272] Driver: 12020 (535.129.3)
2024-06-21 19:26:35.149714: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:273] Runtime: <undefined>
2024-06-21 19:26:35.149730: E external/local_xla/xla/service/gpu/conv_algorithm_picker.cc:280] cudnn version: 8.9.0
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 956s 170ms/step - accuracy: 0.6733 - loss: 17115.9844 - val_accuracy: 0.6763 - val_loss: 0.6544
Epoch 2/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 896s 165ms/step - accuracy: 0.6819 - loss: 0.6472 - val_accuracy: 0.6763 - val_loss: 0.6377
Epoch 3/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 896s 165ms/step - accuracy: 0.6882 - loss: 2.3013 - val_accuracy: 0.6763 - val_loss: 0.6320
Epoch 4/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 896s 165ms/step - accuracy: 0.6872 - loss: 0.6247 - val_accuracy: 0.6763 - val_loss: 0.6302
Epoch 5/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 896s 165ms/step - accuracy: 0.6836 - loss: 0.6251 - val_accuracy: 0.6763 - val_loss: 0.6297
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 285s 52ms/step - accuracy: 0.6851 - loss: 0.6235
605/605 ━━━━━━━━━━━━━━━━━━━━ 32s 52ms/step - accuracy: 0.6785 - loss: 0.6282
1512/1512 ━━━━━━━━━━━━━━━━━━━━ 80s 53ms/step - accuracy: 0.6956 - loss: 0.6160
                   Model  Training Loss  Training Accuracy  Validation Loss  \
0         CNN Base Model       0.623253           0.684726         0.629796   
1  Regularized CNN Model       0.541767           0.780673         0.550529   
2            VGG16 Model       0.623735           0.684726         0.629697   

   Validation Accuracy  Testing Loss  Testing Accuracy  
0             0.676313      0.623872          0.683923  
1             0.775114      0.554473          0.773900  
2             0.676313      0.624291          0.683923  
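A comparison table like the one above can be assembled with a small pandas helper. This is a sketch, not the notebook's actual code: the numbers are copied from the reported results, and the column names are assumptions matching the printout.

```python
import pandas as pd

# Results copied from the model runs above; collecting them in one
# DataFrame makes side-by-side comparison straightforward.
results = [
    ("CNN Base Model",        0.623253, 0.684726, 0.629796, 0.676313, 0.623872, 0.683923),
    ("Regularized CNN Model", 0.541767, 0.780673, 0.550529, 0.775114, 0.554473, 0.773900),
    ("VGG16 Model",           0.623735, 0.684726, 0.629697, 0.676313, 0.624291, 0.683923),
]
columns = ["Model", "Training Loss", "Training Accuracy",
           "Validation Loss", "Validation Accuracy",
           "Testing Loss", "Testing Accuracy"]
comparison_df = pd.DataFrame(results, columns=columns)
print(comparison_df)
```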

The VGG16 Transfer Learning model does not outperform the improved and Regularized CNN Model in terms of accuracy. We shall try the ResNet50 Transfer Learning model next.

Transfer Learning - ResNet50 Model
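A transfer-learning setup along these lines might look as follows. This is a minimal sketch, not the notebook's exact code: the input size, head layers, and optimizer are assumptions, and `weights=None` is used here so the sketch runs without downloading the ImageNet weights that the actual run loads.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

# Frozen ResNet50 backbone (weights=None here to avoid the download;
# the actual run uses the ImageNet weights) plus a small binary head.
base = ResNet50(include_top=False, weights=None, input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # pneumonia vs. normal
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```

Freezing the backbone keeps the pretrained features intact while only the new head is trained; unfreezing the top residual blocks afterwards is a common refinement.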

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step
Epoch 1/5
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1719035506.888833      77 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
W0000 00:00:1719036075.099482      77 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 944s 160ms/step - accuracy: 0.7086 - loss: 1.0445 - val_accuracy: 0.7495 - val_loss: 0.5590
Epoch 2/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 829s 152ms/step - accuracy: 0.7888 - loss: 0.4741 - val_accuracy: 0.8078 - val_loss: 0.4481
Epoch 3/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 812s 149ms/step - accuracy: 0.8129 - loss: 0.4195 - val_accuracy: 0.7664 - val_loss: 1.3505
Epoch 4/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 814s 150ms/step - accuracy: 0.8273 - loss: 0.3990 - val_accuracy: 0.8074 - val_loss: 0.4699
Epoch 5/5
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 815s 150ms/step - accuracy: 0.8406 - loss: 0.3659 - val_accuracy: 0.7350 - val_loss: 0.4855
5441/5441 ━━━━━━━━━━━━━━━━━━━━ 223s 41ms/step - accuracy: 0.8164 - loss: 0.4374
605/605 ━━━━━━━━━━━━━━━━━━━━ 25s 41ms/step - accuracy: 0.7897 - loss: 0.4581
1512/1512 ━━━━━━━━━━━━━━━━━━━━ 61s 40ms/step - accuracy: 0.8120 - loss: 0.4445
                   Model  Training Loss  Training Accuracy  Validation Loss  Validation Accuracy  Testing Loss  Testing Accuracy
0         CNN Base Model       0.623253           0.684726         0.629796             0.676313      0.623872          0.683923
1  Regularized CNN Model       0.541767           0.780673         0.550529             0.775114      0.554473          0.773900
2            VGG16 Model       0.623735           0.684726         0.629697             0.676313      0.624291          0.683923
3         ResNet50 Model       0.434548           0.817572         0.448078             0.807772      0.447309          0.810288

ResNet50 does the best job so far among all the models in identifying pneumonia markers in the images, with an accuracy of 81.02% on the testing dataset. We recommend this model for identifying pneumonia images since it outperforms the rest.

Below are some additional performance metrics for the ResNet50 Transfer Learning Model:

Train Precision: 0.6894247149783777, Train Recall: 0.7667978428800466
Validation Precision: 0.6939024390243902, Validation Recall: 0.7266922094508301
Test Precision: 0.6793427230046948, Test Recall: 0.7571951857666144
  • Testing precision is at 67.93% and testing recall is fairly good at 75.72%
  • A higher recall is essential in medical diagnosis, as missing a positive case can be life-threatening. Our model does a good job of identifying about 75% of the actual positive cases
  • Note that the ResNet50 model was trained for only 5 epochs due to computational constraints. Its performance would likely improve further with longer training.
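The precision and recall figures above reduce to simple counts of true positives, false positives, and false negatives. A minimal sketch, assuming binary 0/1 labels and predictions (the notebook itself may compute these with Keras or scikit-learn):

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy example: 3 true positives, 1 false positive, 1 false negative.
p, r = precision_recall([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
print(p, r)  # 0.75 0.75
```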

Object Detection with Mask R-CNN¶

fatal: destination path 'Mask_RCNN' already exists and is not an empty directory.
Using TensorFlow backend.
Configurations:
BACKBONE                       resnet50
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     8
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        3
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.1
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 8
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  256
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  256
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [256 256   3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               3
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           pneumonia
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                200
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           32
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001
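In the matterport Mask R-CNN codebase, this printout comes from a `Config` subclass. A minimal stand-in (deliberately not importing the library, so it runs anywhere) that mirrors how the derived fields are computed from the values shown above:

```python
class PneumoniaConfigSketch:
    """Stand-in mirroring how mrcnn.config.Config derives its fields;
    the values are taken from the printout above."""
    NAME = "pneumonia"
    GPU_COUNT = 1
    IMAGES_PER_GPU = 8
    NUM_CLASSES = 2            # background + pneumonia opacity
    IMAGE_MIN_DIM = 256
    IMAGE_MAX_DIM = 256
    IMAGE_CHANNEL_COUNT = 3
    STEPS_PER_EPOCH = 200
    VALIDATION_STEPS = 50

    def __init__(self):
        # Derived as in the library: effective batch size, and the
        # square-resized image shape used throughout the network.
        self.BATCH_SIZE = self.IMAGES_PER_GPU * self.GPU_COUNT
        self.IMAGE_SHAPE = (self.IMAGE_MAX_DIM, self.IMAGE_MAX_DIM,
                            self.IMAGE_CHANNEL_COUNT)

cfg = PneumoniaConfigSketch()
print(cfg.BATCH_SIZE, cfg.IMAGE_SHAPE)  # 8 (256, 256, 3)
```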


patientId x y width height Target
0 0004cfab-14fd-4e49-80ba-63a80b6bddd6 NaN NaN NaN NaN 0
1 00313ee0-9eaa-42f4-b0ab-c148ed3241cd NaN NaN NaN NaN 0
2 00322d4d-1c29-4943-afc9-b6754be640eb NaN NaN NaN NaN 0
3 003d8fa0-6bf1-40ed-b54c-ac657f8495c5 NaN NaN NaN NaN 0
4 00436515-870c-4b36-a041-de91049b9ab4 264.0 152.0 213.0 379.0 1
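The head of `stage_2_train_labels.csv` shown above can be loaded and summarized along these lines. The inline CSV below reproduces just those five rows so the sketch is self-contained; on the real dataset, `pd.read_csv` is pointed at the file path instead.

```python
import io
import pandas as pd

# First five rows of stage_2_train_labels.csv, inlined for illustration.
csv_text = """patientId,x,y,width,height,Target
0004cfab-14fd-4e49-80ba-63a80b6bddd6,,,,0
00313ee0-9eaa-42f4-b0ab-c148ed3241cd,,,,0
00322d4d-1c29-4943-afc9-b6754be640eb,,,,0
003d8fa0-6bf1-40ed-b54c-ac657f8495c5,,,,0
00436515-870c-4b36-a041-de91049b9ab4,264.0,152.0,213.0,379.0,1
"""
labels = pd.read_csv(io.StringIO(csv_text))

# Negative rows carry NaN bounding boxes; each positive row carries
# one (x, y, width, height) box for a lung opacity.
positives = labels[labels["Target"] == 1]
print(len(labels), len(positives))  # 5 1
```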
(0008, 0005) Specific Character Set              CS: 'ISO_IR 100'
(0008, 0016) SOP Class UID                       UI: Secondary Capture Image Storage
(0008, 0018) SOP Instance UID                    UI: 1.2.276.0.7230010.3.1.4.8323329.30954.1517874505.286453
(0008, 0020) Study Date                          DA: '19010101'
(0008, 0030) Study Time                          TM: '000000.00'
(0008, 0050) Accession Number                    SH: ''
(0008, 0060) Modality                            CS: 'CR'
(0008, 0064) Conversion Type                     CS: 'WSD'
(0008, 0090) Referring Physician's Name          PN: ''
(0008, 103e) Series Description                  LO: 'view: PA'
(0010, 0010) Patient's Name                      PN: 'a9d7fd57-d71d-4dc2-9556-418b18771f37'
(0010, 0020) Patient ID                          LO: 'a9d7fd57-d71d-4dc2-9556-418b18771f37'
(0010, 0030) Patient's Birth Date                DA: ''
(0010, 0040) Patient's Sex                       CS: 'F'
(0010, 1010) Patient's Age                       AS: '78'
(0018, 0015) Body Part Examined                  CS: 'CHEST'
(0018, 5101) View Position                       CS: 'PA'
(0020, 000d) Study Instance UID                  UI: 1.2.276.0.7230010.3.1.2.8323329.30954.1517874505.286452
(0020, 000e) Series Instance UID                 UI: 1.2.276.0.7230010.3.1.3.8323329.30954.1517874505.286451
(0020, 0010) Study ID                            SH: ''
(0020, 0011) Series Number                       IS: '1'
(0020, 0013) Instance Number                     IS: '1'
(0020, 0020) Patient Orientation                 CS: ''
(0028, 0002) Samples per Pixel                   US: 1
(0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
(0028, 0010) Rows                                US: 1024
(0028, 0011) Columns                             US: 1024
(0028, 0030) Pixel Spacing                       DS: ['0.14300000000000002', '0.14300000000000002']
(0028, 0100) Bits Allocated                      US: 8
(0028, 0101) Bits Stored                         US: 8
(0028, 0102) High Bit                            US: 7
(0028, 0103) Pixel Representation                US: 0
(0028, 2110) Lossy Image Compression             CS: '01'
(0028, 2114) Lossy Image Compression Method      CS: 'ISO_10918_1'
(7fe0, 0010) Pixel Data                          OB: Array of 137282 bytes
25184 1500
[patientId    3860415d-c3b9-4ac6-ba32-51a6eddf6528
 x                                             NaN
 y                                             NaN
 width                                         NaN
 height                                        NaN
 Target                                          0
 Name: 3438, dtype: object]
(1024, 1024, 3)
/kaggle/input/stage_2_train_images/0c122fd9-6dc8-4224-baae-812ed5c4bd12.dcm
[1]
No description has been provided for this image
No description has been provided for this image
Starting at epoch 0. LR=0.012

Checkpoint Path: /kaggle/working/pneumonia20240622T0431/mask_rcnn_pneumonia_{epoch:04d}.h5
Selecting layers to train
conv1                  (Conv2D)
bn_conv1               (BatchNorm)
res2a_branch2a         (Conv2D)
bn2a_branch2a          (BatchNorm)
res2a_branch2b         (Conv2D)
bn2a_branch2b          (BatchNorm)
res2a_branch2c         (Conv2D)
res2a_branch1          (Conv2D)
bn2a_branch2c          (BatchNorm)
bn2a_branch1           (BatchNorm)
res2b_branch2a         (Conv2D)
bn2b_branch2a          (BatchNorm)
res2b_branch2b         (Conv2D)
bn2b_branch2b          (BatchNorm)
res2b_branch2c         (Conv2D)
bn2b_branch2c          (BatchNorm)
res2c_branch2a         (Conv2D)
bn2c_branch2a          (BatchNorm)
res2c_branch2b         (Conv2D)
bn2c_branch2b          (BatchNorm)
res2c_branch2c         (Conv2D)
bn2c_branch2c          (BatchNorm)
res3a_branch2a         (Conv2D)
bn3a_branch2a          (BatchNorm)
res3a_branch2b         (Conv2D)
bn3a_branch2b          (BatchNorm)
res3a_branch2c         (Conv2D)
res3a_branch1          (Conv2D)
bn3a_branch2c          (BatchNorm)
bn3a_branch1           (BatchNorm)
res3b_branch2a         (Conv2D)
bn3b_branch2a          (BatchNorm)
res3b_branch2b         (Conv2D)
bn3b_branch2b          (BatchNorm)
res3b_branch2c         (Conv2D)
bn3b_branch2c          (BatchNorm)
res3c_branch2a         (Conv2D)
bn3c_branch2a          (BatchNorm)
res3c_branch2b         (Conv2D)
bn3c_branch2b          (BatchNorm)
res3c_branch2c         (Conv2D)
bn3c_branch2c          (BatchNorm)
res3d_branch2a         (Conv2D)
bn3d_branch2a          (BatchNorm)
res3d_branch2b         (Conv2D)
bn3d_branch2b          (BatchNorm)
res3d_branch2c         (Conv2D)
bn3d_branch2c          (BatchNorm)
res4a_branch2a         (Conv2D)
bn4a_branch2a          (BatchNorm)
res4a_branch2b         (Conv2D)
bn4a_branch2b          (BatchNorm)
res4a_branch2c         (Conv2D)
res4a_branch1          (Conv2D)
bn4a_branch2c          (BatchNorm)
bn4a_branch1           (BatchNorm)
res4b_branch2a         (Conv2D)
bn4b_branch2a          (BatchNorm)
res4b_branch2b         (Conv2D)
bn4b_branch2b          (BatchNorm)
res4b_branch2c         (Conv2D)
bn4b_branch2c          (BatchNorm)
res4c_branch2a         (Conv2D)
bn4c_branch2a          (BatchNorm)
res4c_branch2b         (Conv2D)
bn4c_branch2b          (BatchNorm)
res4c_branch2c         (Conv2D)
bn4c_branch2c          (BatchNorm)
res4d_branch2a         (Conv2D)
bn4d_branch2a          (BatchNorm)
res4d_branch2b         (Conv2D)
bn4d_branch2b          (BatchNorm)
res4d_branch2c         (Conv2D)
bn4d_branch2c          (BatchNorm)
res4e_branch2a         (Conv2D)
bn4e_branch2a          (BatchNorm)
res4e_branch2b         (Conv2D)
bn4e_branch2b          (BatchNorm)
res4e_branch2c         (Conv2D)
bn4e_branch2c          (BatchNorm)
res4f_branch2a         (Conv2D)
bn4f_branch2a          (BatchNorm)
res4f_branch2b         (Conv2D)
bn4f_branch2b          (BatchNorm)
res4f_branch2c         (Conv2D)
bn4f_branch2c          (BatchNorm)
res5a_branch2a         (Conv2D)
bn5a_branch2a          (BatchNorm)
res5a_branch2b         (Conv2D)
bn5a_branch2b          (BatchNorm)
res5a_branch2c         (Conv2D)
res5a_branch1          (Conv2D)
bn5a_branch2c          (BatchNorm)
bn5a_branch1           (BatchNorm)
res5b_branch2a         (Conv2D)
bn5b_branch2a          (BatchNorm)
res5b_branch2b         (Conv2D)
bn5b_branch2b          (BatchNorm)
res5b_branch2c         (Conv2D)
bn5b_branch2c          (BatchNorm)
res5c_branch2a         (Conv2D)
bn5c_branch2a          (BatchNorm)
res5c_branch2b         (Conv2D)
bn5c_branch2b          (BatchNorm)
res5c_branch2c         (Conv2D)
bn5c_branch2c          (BatchNorm)
fpn_c5p5               (Conv2D)
fpn_c4p4               (Conv2D)
fpn_c3p3               (Conv2D)
fpn_c2p2               (Conv2D)
fpn_p5                 (Conv2D)
fpn_p2                 (Conv2D)
fpn_p3                 (Conv2D)
fpn_p4                 (Conv2D)
In model:  rpn_model
    rpn_conv_shared        (Conv2D)
    rpn_class_raw          (Conv2D)
    rpn_bbox_pred          (Conv2D)
mrcnn_mask_conv1       (TimeDistributed)
mrcnn_mask_bn1         (TimeDistributed)
mrcnn_mask_conv2       (TimeDistributed)
mrcnn_mask_bn2         (TimeDistributed)
mrcnn_class_conv1      (TimeDistributed)
mrcnn_class_bn1        (TimeDistributed)
mrcnn_mask_conv3       (TimeDistributed)
mrcnn_mask_bn3         (TimeDistributed)
mrcnn_class_conv2      (TimeDistributed)
mrcnn_class_bn2        (TimeDistributed)
mrcnn_mask_conv4       (TimeDistributed)
mrcnn_mask_bn4         (TimeDistributed)
mrcnn_bbox_fc          (TimeDistributed)
mrcnn_mask_deconv      (TimeDistributed)
mrcnn_class_logits     (TimeDistributed)
mrcnn_mask             (TimeDistributed)
Epoch 1/2
200/200 [==============================] - 579s 3s/step - loss: 2.6424 - rpn_class_loss: 0.1252 - rpn_bbox_loss: 0.7570 - mrcnn_class_loss: 0.4976 - mrcnn_bbox_loss: 0.7291 - mrcnn_mask_loss: 0.5335 - val_loss: 2.0305 - val_rpn_class_loss: 0.0858 - val_rpn_bbox_loss: 0.4525 - val_mrcnn_class_loss: 0.4236 - val_mrcnn_bbox_loss: 0.5873 - val_mrcnn_mask_loss: 0.4813
Epoch 2/2
200/200 [==============================] - 272s 1s/step - loss: 2.0260 - rpn_class_loss: 0.0780 - rpn_bbox_loss: 0.4762 - mrcnn_class_loss: 0.4035 - mrcnn_bbox_loss: 0.6046 - mrcnn_mask_loss: 0.4637 - val_loss: 2.0225 - val_rpn_class_loss: 0.0696 - val_rpn_bbox_loss: 0.4193 - val_mrcnn_class_loss: 0.4067 - val_mrcnn_bbox_loss: 0.6661 - val_mrcnn_mask_loss: 0.4608
CPU times: user 9min 5s, sys: 48.3 s, total: 9min 54s
Wall time: 17min 50s
Starting at epoch 2. LR=0.006

Checkpoint Path: /kaggle/working/pneumonia20240622T0431/mask_rcnn_pneumonia_{epoch:04d}.h5
Selecting layers to train (same layer list as in the first training stage)
Epoch 3/16
200/200 [==============================] - 1188s 6s/step - loss: 1.8613 - rpn_class_loss: 0.0564 - rpn_bbox_loss: 0.4076 - mrcnn_class_loss: 0.3811 - mrcnn_bbox_loss: 0.5534 - mrcnn_mask_loss: 0.4627 - val_loss: 1.7434 - val_rpn_class_loss: 0.0509 - val_rpn_bbox_loss: 0.4280 - val_mrcnn_class_loss: 0.3270 - val_mrcnn_bbox_loss: 0.5140 - val_mrcnn_mask_loss: 0.4235
Epoch 4/16
200/200 [==============================] - 1007s 5s/step - loss: 1.7676 - rpn_class_loss: 0.0524 - rpn_bbox_loss: 0.4023 - mrcnn_class_loss: 0.3411 - mrcnn_bbox_loss: 0.5210 - mrcnn_mask_loss: 0.4506 - val_loss: 1.7750 - val_rpn_class_loss: 0.0566 - val_rpn_bbox_loss: 0.4284 - val_mrcnn_class_loss: 0.3309 - val_mrcnn_bbox_loss: 0.5265 - val_mrcnn_mask_loss: 0.4326
Epoch 5/16
200/200 [==============================] - 1025s 5s/step - loss: 1.7296 - rpn_class_loss: 0.0463 - rpn_bbox_loss: 0.4070 - mrcnn_class_loss: 0.3341 - mrcnn_bbox_loss: 0.5012 - mrcnn_mask_loss: 0.4410 - val_loss: 1.7105 - val_rpn_class_loss: 0.0497 - val_rpn_bbox_loss: 0.4200 - val_mrcnn_class_loss: 0.3536 - val_mrcnn_bbox_loss: 0.4837 - val_mrcnn_mask_loss: 0.4034
Epoch 6/16
200/200 [==============================] - 1024s 5s/step - loss: 1.6932 - rpn_class_loss: 0.0480 - rpn_bbox_loss: 0.3924 - mrcnn_class_loss: 0.3249 - mrcnn_bbox_loss: 0.4926 - mrcnn_mask_loss: 0.4353 - val_loss: 1.6844 - val_rpn_class_loss: 0.0452 - val_rpn_bbox_loss: 0.4121 - val_mrcnn_class_loss: 0.3016 - val_mrcnn_bbox_loss: 0.5108 - val_mrcnn_mask_loss: 0.4146
Epoch 7/16
200/200 [==============================] - 1014s 5s/step - loss: 1.6230 - rpn_class_loss: 0.0433 - rpn_bbox_loss: 0.3713 - mrcnn_class_loss: 0.3082 - mrcnn_bbox_loss: 0.4726 - mrcnn_mask_loss: 0.4276 - val_loss: 1.6988 - val_rpn_class_loss: 0.0474 - val_rpn_bbox_loss: 0.3952 - val_mrcnn_class_loss: 0.3163 - val_mrcnn_bbox_loss: 0.5227 - val_mrcnn_mask_loss: 0.4171
Epoch 8/16
200/200 [==============================] - 987s 5s/step - loss: 1.6290 - rpn_class_loss: 0.0415 - rpn_bbox_loss: 0.3887 - mrcnn_class_loss: 0.3078 - mrcnn_bbox_loss: 0.4666 - mrcnn_mask_loss: 0.4244 - val_loss: 1.5702 - val_rpn_class_loss: 0.0427 - val_rpn_bbox_loss: 0.3952 - val_mrcnn_class_loss: 0.2825 - val_mrcnn_bbox_loss: 0.4538 - val_mrcnn_mask_loss: 0.3960
Epoch 9/16
200/200 [==============================] - 964s 5s/step - loss: 1.5901 - rpn_class_loss: 0.0386 - rpn_bbox_loss: 0.3807 - mrcnn_class_loss: 0.2948 - mrcnn_bbox_loss: 0.4579 - mrcnn_mask_loss: 0.4179 - val_loss: 1.6032 - val_rpn_class_loss: 0.0402 - val_rpn_bbox_loss: 0.4106 - val_mrcnn_class_loss: 0.2986 - val_mrcnn_bbox_loss: 0.4470 - val_mrcnn_mask_loss: 0.4068
Epoch 10/16
200/200 [==============================] - 992s 5s/step - loss: 1.5671 - rpn_class_loss: 0.0382 - rpn_bbox_loss: 0.3786 - mrcnn_class_loss: 0.2808 - mrcnn_bbox_loss: 0.4523 - mrcnn_mask_loss: 0.4172 - val_loss: 1.5760 - val_rpn_class_loss: 0.0408 - val_rpn_bbox_loss: 0.3814 - val_mrcnn_class_loss: 0.2929 - val_mrcnn_bbox_loss: 0.4644 - val_mrcnn_mask_loss: 0.3964
Epoch 11/16
200/200 [==============================] - 1007s 5s/step - loss: 1.5626 - rpn_class_loss: 0.0392 - rpn_bbox_loss: 0.3761 - mrcnn_class_loss: 0.2820 - mrcnn_bbox_loss: 0.4473 - mrcnn_mask_loss: 0.4180 - val_loss: 1.5417 - val_rpn_class_loss: 0.0375 - val_rpn_bbox_loss: 0.3878 - val_mrcnn_class_loss: 0.2625 - val_mrcnn_bbox_loss: 0.4574 - val_mrcnn_mask_loss: 0.3964
Epoch 12/16
200/200 [==============================] - 983s 5s/step - loss: 1.5340 - rpn_class_loss: 0.0364 - rpn_bbox_loss: 0.3602 - mrcnn_class_loss: 0.2814 - mrcnn_bbox_loss: 0.4410 - mrcnn_mask_loss: 0.4150 - val_loss: 1.5480 - val_rpn_class_loss: 0.0388 - val_rpn_bbox_loss: 0.4035 - val_mrcnn_class_loss: 0.2718 - val_mrcnn_bbox_loss: 0.4410 - val_mrcnn_mask_loss: 0.3929
Epoch 13/16
200/200 [==============================] - 999s 5s/step - loss: 1.5124 - rpn_class_loss: 0.0359 - rpn_bbox_loss: 0.3618 - mrcnn_class_loss: 0.2671 - mrcnn_bbox_loss: 0.4361 - mrcnn_mask_loss: 0.4115 - val_loss: 1.5344 - val_rpn_class_loss: 0.0398 - val_rpn_bbox_loss: 0.3917 - val_mrcnn_class_loss: 0.2725 - val_mrcnn_bbox_loss: 0.4356 - val_mrcnn_mask_loss: 0.3947
Epoch 14/16
200/200 [==============================] - 994s 5s/step - loss: 1.5117 - rpn_class_loss: 0.0356 - rpn_bbox_loss: 0.3694 - mrcnn_class_loss: 0.2697 - mrcnn_bbox_loss: 0.4261 - mrcnn_mask_loss: 0.4109 - val_loss: 1.5646 - val_rpn_class_loss: 0.0365 - val_rpn_bbox_loss: 0.4115 - val_mrcnn_class_loss: 0.2697 - val_mrcnn_bbox_loss: 0.4406 - val_mrcnn_mask_loss: 0.4062
Epoch 15/16
200/200 [==============================] - 992s 5s/step - loss: 1.5066 - rpn_class_loss: 0.0340 - rpn_bbox_loss: 0.3571 - mrcnn_class_loss: 0.2786 - mrcnn_bbox_loss: 0.4277 - mrcnn_mask_loss: 0.4093 - val_loss: 1.4983 - val_rpn_class_loss: 0.0344 - val_rpn_bbox_loss: 0.3840 - val_mrcnn_class_loss: 0.2595 - val_mrcnn_bbox_loss: 0.4371 - val_mrcnn_mask_loss: 0.3833
Epoch 16/16
200/200 [==============================] - 985s 5s/step - loss: 1.5362 - rpn_class_loss: 0.0374 - rpn_bbox_loss: 0.3747 - mrcnn_class_loss: 0.2753 - mrcnn_bbox_loss: 0.4350 - mrcnn_mask_loss: 0.4137 - val_loss: 1.5078 - val_rpn_class_loss: 0.0383 - val_rpn_bbox_loss: 0.3674 - val_mrcnn_class_loss: 0.2692 - val_mrcnn_bbox_loss: 0.4367 - val_mrcnn_mask_loss: 0.3961
CPU times: user 45min 10s, sys: 5min 34s, total: 50min 45s
Wall time: 3h 57min 33s

The model ran for a total of 16 epochs, with a final training loss of 1.53 and a validation loss of 1.50. The loss decreased steadily across epochs, indicating that the model learned effectively from the training images.

Found model /kaggle/working/pneumonia20240622T0431/mask_rcnn_pneumonia_0016.h5
Loading weights from  /kaggle/working/pneumonia20240622T0431/mask_rcnn_pneumonia_0016.h5
Re-starting from epoch 16

Model Evaluation

(256, 256, 3)

*** No instances to display *** 

(256, 256, 3)

*** No instances to display *** 

(256, 256, 3)
(256, 256, 3)

*** No instances to display *** 

(256, 256, 3)

*** No instances to display *** 

(256, 256, 3)

*** No instances to display *** 


The model correctly identified the lung opacity associated with pneumonia in the third instance. However, many false positives were detected in the other instances.

c127904f-d321-4d79-b02d-599b73b0a734
[116  60 177 104]
x 240 y 464 h 176 w 244
[ 70 142 174 194]
x 568 y 280 h 208 w 416
[ 68  62 106  99]
x 248 y 272 h 148 w 152
13aad543-dcc2-4083-bcc6-60a2bfe5c9fb
[120  54 157 103]
x 216 y 480 h 196 w 148
[ 79 148 170 204]
x 592 y 316 h 224 w 364
265c655e-b97d-49b0-8b5c-83be37c0b80e
[ 88 150 178 213]
x 600 y 352 h 252 w 360
[178  60 223  99]
x 240 y 712 h 156 w 180
249b1047-ece4-490d-b903-5e457981986a
[109  58 162 114]
x 232 y 436 h 224 w 212
[120 149 179 203]
x 596 y 480 h 216 w 236
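The predicted boxes printed above are in the 256×256 model space; converting them back to the original 1024×1024 radiograph is a multiplication by the resize factor. A minimal sketch, assuming Mask R-CNN's `(y1, x1, y2, x2)` box convention:

```python
def to_original_coords(box, model_dim=256, orig_dim=1024):
    """Convert a Mask R-CNN (y1, x1, y2, x2) box from model space
    back to x, y, width, height in original-image coordinates."""
    factor = orig_dim / model_dim  # 4x for 256 -> 1024
    y1, x1, y2, x2 = box
    x = x1 * factor
    y = y1 * factor
    width = (x2 - x1) * factor
    height = (y2 - y1) * factor
    return x, y, width, height

# First predicted box from the output above.
print(to_original_coords((116, 60, 177, 104)))  # (240.0, 464.0, 176.0, 244.0)
```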

Visual Inspection Summary:

  • The model appears to have successfully identified regions that visually resemble pneumonia in instances 3 and 4.
  • As previously noted, in instances 1 and 2 the model appears to have incorrectly flagged regions as pneumonia. The visualized predictions should be corroborated by domain experts to guide further refinement of the model.

Conclusion

Achieving roughly 81% test accuracy is a significant result, but continued model refinement is essential for higher accuracy and reliability. The integration of Mask R-CNN with a ResNet50 backbone is a promising approach for localizing pneumonia in medical imaging, and this combined classification-and-detection approach could contribute to more effective and accurate diagnoses.